Abstract — The chain rule for the Shannon and von Neumann 
entropy, which relates the total entropy of a system to the 
entropies of its parts, is of central importance to information 
theory. Here we consider the chain rule for the more general 
smooth min- and max-entropy, used in one-shot information 
theory. For these entropy measures, the chain rule no longer 
holds as an equality, but manifests itself as a set of inequalities 
that reduce to the chain rule for the von Neumann entropy in 
the i.i.d. case. 

Index Terms — smooth min- and max-entropies, chain rules. 



I. Introduction 

In classical and quantum information theory, entropy measures
are often used to characterize fundamental information
processing tasks. For example, in his groundbreaking
work on information and communication theory [1], Shannon 
showed that entropies can be used to quantify the memory 
needed to store the (compressed) output of an information 
source or the capacity of a communication channel. It follows 
immediately from the basic properties of the Shannon entropy 
that the equality 

$$H(AB) = H(A|B) + H(B),$$

which we call the chain rule, must hold. Here, $H(B)$ denotes
the entropy of the random variable $B$ and $H(A|B)$ is the
entropy of the random variable $A$ averaged over side information
in $B$. The chain rule therefore asserts that the entropy of
two (possibly correlated) random variables, $A$ and $B$, can be
decomposed into the entropy of $B$ alone plus the entropy of $A$
conditioned on knowing $B$. More generally, one may average
over additional side information, $C$, in which case the chain
rule takes the more general form

$$H(AB|C) = H(A|BC) + H(B|C). \tag{1}$$

The chain rule forms an integral part of the entropy calculus.
The other basic ingredient is strong sub-additivity, which can
be written as $H(A|BC) \le H(A|C)$, i.e., additional side
information can only decrease the entropy.
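The chain rule (1) and strong sub-additivity can be checked numerically for classical random variables. The following is a minimal sketch in plain Python; the joint distribution and the helper names (`entropy`, `marginal`, `cond_entropy`) are illustrative assumptions and do not come from the paper. The conditional entropy is computed directly as an average over the side information, so the comparison with the chain rule is not circular.

```python
import math
from collections import defaultdict

def entropy(dist):
    """Shannon entropy in bits of a dict {outcome: probability}."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

def marginal(joint, keep):
    """Marginal distribution over the tuple positions listed in `keep`."""
    out = defaultdict(float)
    for outcome, p in joint.items():
        out[tuple(outcome[i] for i in keep)] += p
    return dict(out)

def cond_entropy(joint, target, given):
    """H(target | given), computed as the average of H(target | given = g)."""
    total = 0.0
    for g, pg in marginal(joint, given).items():
        if pg == 0:
            continue
        cond = defaultdict(float)
        for outcome, p in joint.items():
            if tuple(outcome[i] for i in given) == g:
                cond[tuple(outcome[i] for i in target)] += p / pg
        total += pg * entropy(cond)
    return total

# Hypothetical joint distribution of three correlated bits (A, B, C).
joint = {
    (0, 0, 0): 0.30, (0, 1, 0): 0.10, (1, 0, 0): 0.15,
    (1, 0, 1): 0.15, (1, 1, 1): 0.25, (0, 1, 1): 0.05,
}
A, B, C = 0, 1, 2

lhs = cond_entropy(joint, [A, B], [C])                                  # H(AB|C)
rhs = cond_entropy(joint, [A], [B, C]) + cond_entropy(joint, [B], [C])  # H(A|BC) + H(B|C)
print(lhs, rhs)
```

Both printed values should agree up to floating-point error, and `cond_entropy(joint, [A], [B, C])` should not exceed `cond_entropy(joint, [A], [C])`.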

The quantum generalization of Shannon's entropy, the von
Neumann entropy, inherits these fundamental properties. For
a quantum state $\rho_A$ on $A$, the von Neumann entropy is
defined as $H(A)_\rho := -\mathrm{tr}(\rho_A \log \rho_A)$, where tr denotes
the trace and log is taken in base 2 throughout this paper.
The conditional von Neumann entropy with classical side
information can again be defined by an average; however, this
intuitive definition fails if the side information is quantum.
Pointing to its fundamental importance, the conditional von
Neumann entropy is thus defined by the chain rule itself, i.e.,
$H(A|B)_\rho := H(AB)_\rho - H(B)_\rho$. In addition to the chain rule
and strong sub-additivity, it also satisfies a duality relation:
for any pure tripartite state $\rho_{ABC}$, we have
$H(A|B)_\rho = -H(A|C)_\rho$.
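The duality relation for pure tripartite states can be illustrated numerically. The sketch below assumes NumPy is available; the dimensions, the random seed, and the helper names (`vn_entropy`, `reduce_to`, `cond_entropy`) are illustrative choices, not part of the paper.

```python
import numpy as np

def vn_entropy(rho):
    """von Neumann entropy in bits; eigenvalues below 1e-12 are treated as zero."""
    ev = np.linalg.eigvalsh(rho)
    ev = ev[ev > 1e-12]
    return float(-np.sum(ev * np.log2(ev)))

def reduce_to(rho, dims, keep):
    """Partial trace over every subsystem whose index is not in `keep` (ascending)."""
    t = rho.reshape(dims + dims)
    # Trace out from the highest index down so the remaining axis numbers stay valid.
    for i in sorted(set(range(len(dims))) - set(keep), reverse=True):
        t = np.trace(t, axis1=i, axis2=i + t.ndim // 2)
    d = int(np.prod([dims[i] for i in keep]))
    return t.reshape(d, d)

def cond_entropy(rho, dims, target, given):
    """H(target|given) := H(target, given) - H(given), i.e. the chain-rule definition."""
    joint = sorted(target + given)
    return vn_entropy(reduce_to(rho, dims, joint)) - vn_entropy(reduce_to(rho, dims, sorted(given)))

rng = np.random.default_rng(0)
dims = [2, 3, 2]                       # hypothetical dimensions of A, B, C
psi = rng.normal(size=12) + 1j * rng.normal(size=12)
psi /= np.linalg.norm(psi)
rho_abc = np.outer(psi, psi.conj())    # random pure state on ABC

A, B, C = 0, 1, 2
h_a_given_b = cond_entropy(rho_abc, dims, [A], [B])
h_a_given_c = cond_entropy(rho_abc, dims, [A], [C])
print(h_a_given_b, -h_a_given_c)
```

The two printed numbers should coincide up to numerical precision. Note that, unlike in the classical case, either conditional entropy may be negative.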

Shannon and von Neumann entropies have been success- 
fully employed to characterize an enormous variety of infor- 
mation theoretic tasks, many of which are of high practical 
relevance (examples include the aforementioned tasks of data 
compression or channel coding). However, a basic assumption 
usually made in this context is that the underlying random 
processes (e.g., those relevant for the generation of data, or the 
occurrence of noise in a communication channel) are modeled 
asymptotically by an arbitrarily long sequence of random variables
that are independent and identically distributed (i.i.d.).
In the absence of this assumption (e.g., if a channel is only 
invoked a small number of times or if its noise model is 
not i.i.d.), the use of the von Neumann entropy is generally 
no longer justified. The formalism of smooth min- and max- 
entropy, introduced in [2]-[4] and further developed in [5]-[8], 
overcomes this limitation and enables the analysis of general 
situations beyond the i.i.d. scenario. This level of generality 
turned out to be crucial in various areas, e.g., in physics 
(where entropies are employed for the analysis of problems 
in thermodynamics [9]) or in cryptography (where entropies 
are used to quantify an adversary's uncertainty). 

Smooth min- and max-entropy, denoted $H_{\min}^{\varepsilon}$ and $H_{\max}^{\varepsilon}$,
respectively, depend on a positive real value $\varepsilon$, called the
smoothing parameter (see Section II for formal definitions). When
the entropies are used to characterize operational tasks, the
smoothing parameter determines the desired accuracy. For
example, the smooth min-entropy, $H_{\min}^{\varepsilon}(A|B)$, characterizes
the number of fully mixed qubits, independent (i.e., decoupled)
from side information $B$, that can be extracted from a quantum
source $A$ [10], [11]. Furthermore, the smooth max-entropy,
$H_{\max}^{\varepsilon}(A|B)$, characterizes the amount of entanglement needed
between two parties, $A$ and $B$, to merge a state $\rho_{AB}$, where
$\rho_A$ is initially held by $A$, to $B$ [11], [12]. In both cases, the
smoothing parameter $\varepsilon$ corresponds to the maximum distance
between the desired final state and the one that can be
achieved.

Smooth entropy can be seen as a strict generalization of
Shannon or von Neumann entropy. In particular, the latter can 
be recovered by evaluating the smooth min- or max-entropy 
for i.i.d. states [3], [6]. Accordingly, smooth entropy inherits 
many of the basic features of von Neumann entropy, such 
as strong sub-additivity. In light of this, it should not come 
as a surprise that smooth entropy also obeys inequalities that 
generalize the chain rule (1). Deriving these is the main aim 
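of this work.

Specifically, one can obtain four pairs of generalized chain
inequalities. For any small smoothing parameters $\varepsilon', \varepsilon'', \varepsilon''' \ge 0$
and $\varepsilon > \varepsilon' + 2\varepsilon''$, we have

$$H_{\min}^{\varepsilon}(AB|C)_\rho \ge H_{\min}^{\varepsilon''}(A|BC)_\rho + H_{\min}^{\varepsilon'}(B|C)_\rho - f,$$
$$H_{\max}^{\varepsilon}(AB|C)_\rho \le H_{\max}^{\varepsilon'}(A|BC)_\rho + H_{\max}^{\varepsilon''}(B|C)_\rho + f,$$

$$H_{\min}^{\varepsilon'}(AB|C)_\rho \le H_{\min}^{\varepsilon}(A|BC)_\rho + H_{\max}^{\varepsilon''}(B|C)_\rho + 2f,$$
$$H_{\max}^{\varepsilon'}(AB|C)_\rho \ge H_{\min}^{\varepsilon''}(A|BC)_\rho + H_{\max}^{\varepsilon}(B|C)_\rho - 2f,$$

$$H_{\min}^{\varepsilon'}(AB|C)_\rho \le H_{\max}^{\varepsilon''}(A|BC)_\rho + H_{\min}^{\varepsilon}(B|C)_\rho + 3f,$$
$$H_{\max}^{\varepsilon'}(AB|C)_\rho \ge H_{\max}^{\varepsilon}(A|BC)_\rho + H_{\min}^{\varepsilon''}(B|C)_\rho - 3f,$$

$$H_{\min}^{\varepsilon'}(AB|C)_\rho \le H_{\max}^{\varepsilon'''}(A|BC)_\rho + H_{\max}^{\varepsilon''}(B|C)_\rho + g,$$
$$H_{\max}^{\varepsilon'}(AB|C)_\rho \ge H_{\min}^{\varepsilon''}(A|BC)_\rho + H_{\min}^{\varepsilon'''}(B|C)_\rho - g,$$

where $f$ grows at most on the order of $\log\bigl(1/(\varepsilon - \varepsilon' - 2\varepsilon'')\bigr)$ when
$\varepsilon - \varepsilon' - 2\varepsilon''$ is small, and $g$ is smaller than 6 for
$\varepsilon' + 2\varepsilon'' + \varepsilon''' < 1/5$. We note that, in typical applications, we
would choose the smoothing parameters so that the correction
terms $f$ and $g$ are small compared to the typical values of the
smooth entropies.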

Generalized chain inequalities for the smooth min- and
max-entropy are not only important for establishing
a complete entropy calculus, analogous to that for the von
Neumann entropy; they are also crucial for applications.
However, until now, only special cases of these inequalities
have been known, except for the first pair, which has been
derived in [11]. In the present paper we provide proofs for
the remaining relations. In fact, since smooth min- and max-
entropy obey a duality relation similar to that of von Neumann
entropy, $H_{\min}^{\varepsilon}(A|B) = -H_{\max}^{\varepsilon}(A|C)$ (see Lemma 5), the
paired inequalities above imply each other. It will therefore
suffice to prove only one inequality of each pair.

The paper is organized as follows. In the next section 
we introduce the notation, terminology, and basic defini- 
tions. In particular, we define the (smooth) min- and max- 
entropy measures and outline some of their basic features. 
In Section III we derive alternative expressions for the max- 
entropy based on semidefinite programming duality. While 
these expressions may be of independent interest, they will 
be used in Section IV, which is devoted to the statement and 
proofs of the generalized chain rules. 

II. Mathematical Preliminaries 
A. Notation and basic definitions 

Throughout this paper we focus on finite-dimensional
Hilbert spaces. Hilbert spaces corresponding to different physical
systems are distinguished by different capital Latin letters
as subscripts, $\mathcal{H}_A$, $\mathcal{H}_B$, etc. The tensor product of $\mathcal{H}_A$ and $\mathcal{H}_B$
is designated in short by $\mathcal{H}_{AB} = \mathcal{H}_A \otimes \mathcal{H}_B$.

The set of linear operators from $\mathcal{H}_A$ to $\mathcal{H}_B$ is denoted
by $\mathcal{L}(\mathcal{H}_A, \mathcal{H}_B)$. The space of linear operators acting on
the Hilbert space $\mathcal{H}$ is denoted by $\mathcal{L}(\mathcal{H})$ and the subset
of $\mathcal{L}(\mathcal{H})$ containing the Hermitian operators on $\mathcal{H}$ is
denoted by $\mathrm{Herm}(\mathcal{H})$. Note that $\mathrm{Herm}(\mathcal{H})$ endowed with
the Hilbert-Schmidt inner product $\langle X, Y \rangle := \mathrm{tr}(X^\dagger Y)$,
$X, Y \in \mathrm{Herm}(\mathcal{H})$, is a Hilbert space. Given an operator
$R \in \mathrm{Herm}(\mathcal{H})$, we write $R \ge 0$ if and only if $R$ is
positive semi-definite and $R > 0$ if and only if it is positive
definite. Furthermore, let $\mathcal{S}_{\le}(\mathcal{H})$ and $\mathcal{S}_{=}(\mathcal{H})$ denote the sets of
sub-normalized and normalized positive semi-definite density
operators with $\mathrm{tr}\,\rho \le 1$ and $\mathrm{tr}\,\rho = 1$, respectively.

We generalize the notion of inequality to Hermitian operators
in the following way: Let $R, S \in \mathrm{Herm}(\mathcal{H})$; then we
write $R \ge S$, respectively $R > S$, if and only if $R - S$ is
positive semi-definite, respectively positive definite.

Given an operator $R$, the operator norm of $R$ is denoted by
$\|R\|_\infty$ and is equal to the highest singular value of $R$. The
trace norm of $R$ is given by $\|R\|_1 := \mathrm{tr}\bigl[\sqrt{R^\dagger R}\bigr]$. The fidelity
between two states $\rho, \sigma \in \mathcal{S}_{\le}(\mathcal{H})$ is defined as
$F(\rho, \sigma) := \|\sqrt{\rho}\sqrt{\sigma}\|_1$.

For multipartite operators on product spaces $\mathcal{H}_{AB}$ we will
use subscripts to denote the space on which they act (e.g.,
$S_{AB}$ for an operator on $\mathcal{H}_{AB}$). Given a multipartite operator
$S_{AB} \in \mathcal{L}(\mathcal{H}_{AB})$, the corresponding reduced operator on $\mathcal{H}_A$
is defined by $S_A := \mathrm{tr}_B[S_{AB}]$, where $\mathrm{tr}_B$ denotes the partial
trace operator on the subsystem $\mathcal{H}_B$. Given a multipartite
operator $S_{AB}$ and the corresponding marginal operator $S_A$,
we call $S_{AB}$ an extension of $S_A$. We omit identities from
expressions which involve multipartite operators whenever
mathematically meaningful expressions can be obtained by
tensoring the corresponding identities to the operators.

B. Smooth Min- and Max-Entropies 

In the following we successively give the definitions of the 
non-smooth min- and max-entropies and their smooth versions 
[3], [5]. 

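Definition 1. Let $\rho_{AB} \in \mathcal{S}_{\le}(\mathcal{H}_{AB})$; then the min-entropy of
$A$ conditioned on $B$ of $\rho_{AB}$ is defined as

$$H_{\min}(A|B)_\rho := \max_{\sigma_B \in \mathcal{S}_{\le}(\mathcal{H}_B)} H_{\min}(A|B)_{\rho|\sigma}, \quad \text{where}$$

$$H_{\min}(A|B)_{\rho|\sigma} := \sup\{\lambda \in \mathbb{R} : \rho_{AB} \le 2^{-\lambda}\, \mathbb{1}_A \otimes \sigma_B\}. \tag{2}$$

Note that $H_{\min}(A|B)_{\rho|\sigma}$ is finite if and only if $\mathrm{supp}(\rho_B) \subseteq
\mathrm{supp}(\sigma_B)$ and divergent otherwise.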
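For intuition, Definition 1 can be evaluated in the purely classical special case, where $\rho_{AB}$ is diagonal with entries $p(a,b)$ and $\sigma_B$ is diagonal with entries $q(b)$. The sketch below covers only this commuting case, which is an assumption made for illustration; the function names and the example distribution are hypothetical. In this case the relative quantity reduces to $-\log \max_{a,b} p(a,b)/q(b)$, and the optimization over $q$ yields $-\log \sum_b \max_a p(a,b)$.

```python
import math

def hmin_relative(p, q):
    """Classical H_min(A|B)_{rho|sigma}: sup{lam : p(a,b) <= 2^-lam * q(b) for all a, b}."""
    ratio = 0.0
    for row in p:                      # p[a][b] is the joint probability of (a, b)
        for b, pab in enumerate(row):
            if pab > 0.0:
                if q[b] == 0.0:        # support condition violated: divergent
                    return -math.inf
                ratio = max(ratio, pab / q[b])
    return -math.log2(ratio)

def hmin(p):
    """Classical H_min(A|B)_rho = -log2 sum_b max_a p(a,b)."""
    n_b = len(p[0])
    return -math.log2(sum(max(row[b] for row in p) for b in range(n_b)))

# Hypothetical joint distribution of two correlated bits.
p = [[0.4, 0.1],
     [0.1, 0.4]]

# The optimal sigma_B is proportional to the column maxima of p.
col_max = [max(row[b] for row in p) for b in range(len(p[0]))]
q_opt = [m / sum(col_max) for m in col_max]

print(hmin_relative(p, q_opt), hmin(p))
print(hmin_relative(p, [0.9, 0.1]))    # a suboptimal sigma_B gives a smaller value
```

The first line should print the same value twice; any other normalized choice of `q` should give a value that is no larger.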

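Definition 2. Let $\rho_{AB} \in \mathcal{S}_{\le}(\mathcal{H}_{AB})$; then the max-entropy of
$A$ conditioned on $B$ of $\rho_{AB}$ is defined as

$$H_{\max}(A|B)_\rho := \max_{\sigma_B \in \mathcal{S}_{\le}(\mathcal{H}_B)} H_{\max}(A|B)_{\rho|\sigma}, \quad \text{where}$$

$$H_{\max}(A|B)_{\rho|\sigma} := \log F(\rho_{AB}, \mathbb{1}_A \otimes \sigma_B)^2. \tag{3}$$

The maximum in (2) and (3) is achieved in $\mathcal{S}_{=}(\mathcal{H}_B)$. The
$\varepsilon$-smooth min- and max-entropies of a state $\rho$ can be understood
as an optimization of the corresponding non-smooth quantities
over a set of states $\varepsilon$-close to $\rho$. We use the purified distance
to quantify the $\varepsilon$-closeness of states.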

Definition 3. Let $\rho, \sigma \in \mathcal{S}_{\le}(\mathcal{H})$. Then the purified distance
between $\rho$ and $\sigma$ is defined by

$$P(\rho, \sigma) := \sqrt{1 - \bar{F}(\rho, \sigma)^2}, \quad \text{where} \tag{4}$$
$$\bar{F}(\rho, \sigma) := F(\rho, \sigma) + \sqrt{(1 - \mathrm{tr}\,\rho)(1 - \mathrm{tr}\,\sigma)} \tag{5}$$
is the generalized fidelity.

From now on, when two states $\rho, \sigma \in \mathcal{S}_{\le}(\mathcal{H})$ are said to be
$\varepsilon$-close we mean $P(\rho, \sigma) \le \varepsilon$ and denote this by $\rho \approx_\varepsilon \sigma$. Some
of the basic properties of the purified distance are reviewed
in Appendix B, but for a more comprehensive treatment we
refer to [7]. With that convention we are ready to introduce a
smoothed version of the min- and max-entropies [3].

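Definition 4. Let $\varepsilon \ge 0$ and $\rho_{AB} \in \mathcal{S}_{\le}(\mathcal{H}_{AB})$. Then the $\varepsilon$-smooth
min-entropy of $A$ conditioned on $B$ of $\rho_{AB}$ is defined as

$$H_{\min}^{\varepsilon}(A|B)_\rho := \max_{\tilde\rho} H_{\min}(A|B)_{\tilde\rho} \tag{6}$$

and the $\varepsilon$-smooth max-entropy of $A$ conditioned on $B$ of $\rho_{AB}$
is defined as

$$H_{\max}^{\varepsilon}(A|B)_\rho := \min_{\tilde\rho} H_{\max}(A|B)_{\tilde\rho}, \tag{7}$$

where the maximum and the minimum range over all sub-normalized
states $\tilde\rho_{AB} \approx_\varepsilon \rho_{AB}$.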

The smooth min- and max-entropies are dual to each other 
in the following sense [7]: 

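Lemma 5. Let $\varepsilon \ge 0$, $\rho_{AB} \in \mathcal{S}_{\le}(\mathcal{H}_{AB})$ and let $\rho_{ABC} \in
\mathcal{S}_{\le}(\mathcal{H}_{ABC})$ be any purification of $\rho_{AB}$. Then,

$$H_{\max}^{\varepsilon}(A|B)_\rho = -H_{\min}^{\varepsilon}(A|C)_\rho. \tag{8}$$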

Finally, the smooth min-entropy is upper-bounded by the 
smooth max-entropy as shown by the following lemma whose 
proof is deferred to Appendix A: 

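Lemma 6. Let $\rho_{AB} \in \mathcal{S}_{\le}(\mathcal{H}_{AB})$ and $\varepsilon, \varepsilon' \ge 0$ such that
$\varepsilon + \varepsilon' + 2\sqrt{1 - \mathrm{tr}\,\rho_{AB}} < 1$. Then,

$$H_{\min}^{\varepsilon'}(A|B)_\rho \le H_{\max}^{\varepsilon}(A|B)_\rho + \log \frac{1}{1 - \bigl(\varepsilon + \varepsilon' + 2\sqrt{1 - \mathrm{tr}\,\rho}\bigr)^2}. \tag{9}$$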

C. Semidefinite Programming 

This subsection is devoted to the duality theory of semi- 
definite programs (SDPs). We will present the subject as given 
in [13] and especially in [14] but will restrict the discussion 
to the special case which is of interest in this work. 
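A semidefinite program over the Hilbert spaces $\mathcal{H}_A$ and $\mathcal{H}_B$ is
a triple $(\mathcal{F}, R_A, S_B)$, with $\mathcal{F} \in \mathcal{L}(\mathrm{Herm}(\mathcal{H}_A), \mathrm{Herm}(\mathcal{H}_B))$,
$R_A \in \mathrm{Herm}(\mathcal{H}_A)$ and $S_B \in \mathrm{Herm}(\mathcal{H}_B)$, which is associated with
the following two optimization problems:

Primal problem:

minimize: $\mathrm{tr}[R_A X_A]$
subject to: $\mathcal{F}(X_A) \ge S_B$
$X_A \ge 0$

Dual problem:

maximize: $\mathrm{tr}[S_B Y_B]$
subject to: $\mathcal{F}^\dagger(Y_B) \le R_A$
$Y_B \ge 0$

where $X_A \in \mathrm{Herm}(\mathcal{H}_A)$ and $Y_B \in \mathrm{Herm}(\mathcal{H}_B)$ are
variables. Operators $X_A \ge 0$ and $Y_B \ge 0$ such that $\mathcal{F}(X_A) \ge S_B$
and $\mathcal{F}^\dagger(Y_B) \le R_A$ are called primal feasible
plan and dual feasible plan, respectively. We also denote the
solutions to the primal and dual problems by

$$\gamma := \inf\{\mathrm{tr}[R_A X_A] : X_A \text{ is a primal feasible plan}\},$$

$$\delta := \sup\{\mathrm{tr}[S_B Y_B] : Y_B \text{ is a dual feasible plan}\}.$$

Operators $X_A \ge 0$ and $Y_B \ge 0$ satisfying $\mathrm{tr}[R_A X_A] = \gamma$
and $\mathrm{tr}[S_B Y_B] = \delta$ are called primal optimal plan and
dual optimal plan, respectively.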

According to the weak duality theorem, $\gamma \ge \delta$. The difference
$\gamma - \delta$ is called the duality gap. The following theorem, known as
Slater's condition, establishes an easy-to-check condition under
which the duality gap vanishes, that is, $\gamma = \delta$.

Theorem 7. Let $\gamma$ and $\delta$ be defined as above and $(\mathcal{F}, R_A, S_B)$
with $R_A \in \mathrm{Herm}(\mathcal{H}_A)$ and $S_B \in \mathrm{Herm}(\mathcal{H}_B)$ a semi-definite
program. Then the following two implications hold:

(i) [Strict dual feasibility] Suppose $\gamma$ is finite and that there
exists an operator $Y_B > 0$ such that $\mathcal{F}^\dagger(Y_B) < R_A$. Then
$\gamma = \delta$.

(ii) [Strict primal feasibility] Suppose that $\delta$ is finite and that
there exists an operator $X_A > 0$ such that $\mathcal{F}(X_A) > S_B$.
Then $\gamma = \delta$.

III. New Expressions and Bounds for the Smooth 
Max-Entropy 

In the following, we give alternative expressions for
$H_{\max}(A|B)_{\rho|\sigma}$ and $H_{\max}(A|B)_\rho$ based on the analysis of
their corresponding SDPs. Then, we prove inequalities relating
these entropies with a new entropic measure that turns out to
be a useful tool for proving the chain rules.

A. New Expressions via SDP Duality 

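Lemma 8. Let $\rho_{AB} \in \mathcal{S}_{\le}(\mathcal{H}_{AB})$, $\sigma_B \in \mathcal{S}_{\le}(\mathcal{H}_B)$ and let
$\rho_{ABC}$ be a purification of $\rho_{AB}$ on an auxiliary Hilbert space
$\mathcal{H}_C$. Then the max-entropy of $A$ conditioned on $B$ of $\rho_{AB}$
relative to $\sigma_B$ is given by

$$H_{\max}(A|B)_{\rho|\sigma} = \log \min_{Z_{AB}} \mathrm{tr}[(\mathbb{1}_A \otimes \sigma_B) Z_{AB}], \tag{10}$$

where the minimum ranges over all $Z_{AB} \in \mathcal{P}(\mathcal{H}_{AB})$ with
$\rho_{ABC} \le Z_{AB} \otimes \mathbb{1}_C$.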

Proof: Uhlmann's theorem [15] tells us that the fidelity
can be expressed as a maximization of the overlap of purifications
in which the optimization goes over one purification
only. In particular, if $\rho_{ABC}$ is a purification of $\rho_{AB}$, then

$$2^{H_{\max}(A|B)_{\rho|\sigma}} = F(\rho_{AB}, \mathbb{1}_A \otimes \sigma_B)^2
= \max_{\substack{X_{ABC} \ge 0 \\ \mathrm{tr}_C[X_{ABC}] = \mathbb{1}_A \otimes \sigma_B}} F(\rho_{ABC}, X_{ABC})^2
= \max_{\substack{X_{ABC} \ge 0 \\ \mathrm{tr}_C[X_{ABC}] = \mathbb{1}_A \otimes \sigma_B}} \mathrm{tr}[\rho_{ABC} X_{ABC}], \tag{11}$$

where we optimize over all rank-one extensions $X_{ABC}$ of
$\mathbb{1}_A \otimes \sigma_B$. Notice that instead of optimizing over pure states
only we can let the maximization range over all positive
semidefinite operators $X_{ABC}$, since by Uhlmann's theorem we
can always pick a large enough purifying system $\mathcal{H}_C$ such
that there exists an optimal rank-one $X_{ABC}$. Furthermore, for
any positive semidefinite operator $X_{ABC}$ with $\mathrm{tr}_C[X_{ABC}] \le
\mathbb{1}_A \otimes \sigma_B$ we can define an operator

$$\tilde{X}_{ABC} := X_{ABC} + Y_C \otimes (\mathbb{1}_A \otimes \sigma_B - \mathrm{tr}_C X_{ABC}),$$

with $Y_C$ an arbitrary element of $\mathcal{S}_{=}(\mathcal{H}_C)$. By construction it
satisfies $\mathrm{tr}_C \tilde{X}_{ABC} = \mathbb{1}_A \otimes \sigma_B$ and also

$$\mathrm{tr}[\tilde{X}_{ABC}\, \rho_{ABC}] \ge \mathrm{tr}[X_{ABC}\, \rho_{ABC}].$$

Hence, in (11) we can take the maximum over the set of all
nonnegative operators $X_{ABC}$ whose partial trace $\mathrm{tr}_C X_{ABC}$
is bounded by $\mathbb{1}_A \otimes \sigma_B$ (instead of being equal to $\mathbb{1}_A \otimes \sigma_B$).
The SDP for the conditional max-entropy of a state $\rho_{AB} \in
\mathcal{S}_{\le}(\mathcal{H}_{AB})$ relative to a given $\sigma_B \in \mathcal{S}_{\le}(\mathcal{H}_B)$ is as follows:

Primal problem:
minimum: $\mathrm{tr}[(\mathbb{1}_A \otimes \sigma_B) Z_{AB}]$
subject to: $Z_{AB} \otimes \mathbb{1}_C \ge \rho_{ABC}$
$Z_{AB} \ge 0$.

Dual problem:
maximum: $\mathrm{tr}[X_{ABC}\, \rho_{ABC}]$
subject to: $\mathrm{tr}_C[X_{ABC}] \le \mathbb{1}_A \otimes \sigma_B$
$X_{ABC} \ge 0$

where $Z_{AB}$ is a primal variable and $X_{ABC}$ a dual variable.
Since the set over which one is optimizing in the dual problem
is closed and bounded, it is compact, and by the Weierstrass
theorem the dual optimum is attained. Hence, the dual
optimal plan is finite. Furthermore, the operator $Z_{AB} =
2\|\rho_{ABC}\|_\infty \mathbb{1}_{AB} > 0$ satisfies Slater's strict primal feasibility
condition $2\|\rho_{ABC}\|_\infty \mathbb{1}_{ABC} - \rho_{ABC} > 0$ and thus the duality
gap between the primal and dual optimization problems
vanishes. ■
Next, we write out the SDP for H max (A\B) p and explore 
the duality gap between the optimization problems. 

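Lemma 9. Let $\rho_{AB} \in \mathcal{S}_{\le}(\mathcal{H}_{AB})$ and let $\rho_{ABC}$ be a
purification of $\rho_{AB}$ on an auxiliary Hilbert space $\mathcal{H}_C$. Then
the max-entropy of $A$ conditioned on $B$ of $\rho_{AB}$ is given by

$$H_{\max}(A|B)_\rho = \log \min_{Z_{AB}} \|Z_B\|_\infty, \tag{12}$$

where the minimum ranges over all $Z_{AB} \in \mathcal{P}(\mathcal{H}_{AB})$ with
$\rho_{ABC} \le Z_{AB} \otimes \mathbb{1}_C$.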

Proof: The only thing that changes with respect to the
SDP in Lemma 8 is that $\sigma_B$ is no longer fixed but becomes
a dual variable. Thus the SDP for $H_{\max}(A|B)_\rho$ reads:

Primal problem:
minimum: $\lambda$
subject to: $Z_{AB} \otimes \mathbb{1}_C \ge \rho_{ABC}$
$\lambda \mathbb{1}_B \ge \mathrm{tr}_A[Z_{AB}]$
$Z_{AB} \ge 0$, $\lambda \ge 0$

Dual problem:
maximum: $\mathrm{tr}[X_{ABC}\, \rho_{ABC}]$
subject to: $\mathrm{tr}_C[X_{ABC}] \le \mathbb{1}_A \otimes \sigma_B$
$\mathrm{tr}[\sigma_B] \le 1$
$X_{ABC} \ge 0$, $\sigma_B \ge 0$

where $\lambda$ and $Z_{AB}$ are primal variables and $\sigma_B$ and
$X_{ABC}$ dual variables. Obviously, the optimal $\lambda$ is equal to
the largest eigenvalue of $Z_B$. Hence, the above program may
be rewritten in the form:

Primal problem:
minimum: $\|Z_B\|_\infty$
subject to: $Z_{AB} \otimes \mathbb{1}_C \ge \rho_{ABC}$
$Z_{AB} \ge 0$

Dual problem:
maximum: $\mathrm{tr}[X_{ABC}\, \rho_{ABC}]$
subject to: $\mathrm{tr}_C[X_{ABC}] \le \mathbb{1}_A \otimes \sigma_B$
$\mathrm{tr}[\sigma_B] \le 1$
$X_{ABC} \ge 0$, $\sigma_B \ge 0$

In the dual problem we are optimizing over compact
sets, thus there exists a finite dual optimal plan. Furthermore,
$Z_{AB} = 2\|\rho_{ABC}\|_\infty \mathbb{1}_{AB} > 0$ and $\lambda = 2\|Z_B\|_\infty > 0$ satisfy
Slater's strict primal feasibility condition $Z_{AB} \otimes \mathbb{1}_C > \rho_{ABC}$
and $\lambda \mathbb{1}_B > \mathrm{tr}_A[Z_{AB}]$, which implies a zero duality gap. ■



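Note that one can always write the operator norm of $Z_B$ as

$$\|Z_B\|_\infty = \max_{\sigma_B} \mathrm{tr}[\sigma_B Z_B] = \max_{\sigma_B} \mathrm{tr}[(\mathbb{1}_A \otimes \sigma_B) Z_{AB}],$$

where the maximum ranges over all $\sigma_B \in \mathcal{S}_{\le}(\mathcal{H}_B)$. Expression
(12) then acquires the form

$$H_{\max}(A|B)_\rho = \log \min_{\rho_{ABC} \le Z_{AB} \otimes \mathbb{1}_C}\; \max_{\sigma_B} \mathrm{tr}[(\mathbb{1}_A \otimes \sigma_B) Z_{AB}]. \tag{13}$$

On the other hand, from the vanishing of the duality gap in
the SDP of $H_{\max}(A|B)_{\rho|\sigma}$ it follows that

$$\log F(\rho_{AB}, \mathbb{1}_A \otimes \sigma_B)^2 = \log \min_{Z_{AB}} \mathrm{tr}[(\mathbb{1}_A \otimes \sigma_B) Z_{AB}],$$

which after maximization of the left- and the right-hand sides
over $\sigma_B \in \mathcal{S}_{\le}(\mathcal{H}_B)$ implies

$$H_{\max}(A|B)_\rho = \log \max_{\sigma_B} \min_{Z_{AB}} \mathrm{tr}[(\mathbb{1}_A \otimes \sigma_B) Z_{AB}].$$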

Therefore, the operations min and max in (13) commute. In 
the following we restate Lemma 6 and prove it using the 
derived SDPs for the non-smooth max-entropy. 

Henceforth, we will use (3), (10) and (12) and (13) as 
interchangeable expressions for the conditional max-entropy 
and the conditional relative max-entropy, respectively. 



B. A Bound on the Relative Conditional Entropy 

We prove an important technical lemma, which we later use 
to derive one of the chain rules. 

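Lemma 10. Let $\rho_{AB} \in \mathcal{S}_{\le}(\mathcal{H}_{AB})$, $\rho'_{AB} \approx_{\varepsilon'} \rho_{AB}$ and $\varepsilon > 0$.
Then there exists a state $\tilde\rho_{AB} \approx_{\varepsilon+\varepsilon'} \rho'_{AB}$ such that

$$H_{\max}(A|B)_{\tilde\rho} \le H_{\max}(A|B)_{\rho|\rho'} + \log \frac{1}{1 - \sqrt{1 - \varepsilon^2}}. \tag{14}$$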

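Proof: Let $Z_{AB}$ be an optimal primal plan for the
semidefinite program for $H_{\max}(A|B)_{\rho|\rho'}$ and $\Pi_B$ be the
minimum rank projector onto the smallest eigenvalues of the
reduced operator $Z_B$ such that $\mathrm{tr}[\Pi_B^\perp \rho'_B] \le 1 - \sqrt{1 - \varepsilon^2}$,
where $\Pi_B^\perp$ is the orthogonal complement of $\Pi_B$, and let
$\tilde\rho_{AB} := \Pi_B \rho_{AB} \Pi_B$. By Equation (12), we can write

$$2^{H_{\max}(A|B)_{\tilde\rho}} = \min_{\tilde\rho_{ABC} \le \tilde{Z}_{AB} \otimes \mathbb{1}_C} \|\tilde{Z}_B\|_\infty \le \|\Pi_B Z_B \Pi_B\|_\infty,$$

where we used the fact that $\rho_{ABC} \le Z_{AB} \otimes \mathbb{1}_C$ implies
$\tilde\rho_{ABC} \le \Pi_B Z_{AB} \Pi_B \otimes \mathbb{1}_C$. Let $\Pi'_B$ be the projector onto the
largest eigenvalue of $\Pi_B Z_B \Pi_B$. Since $\Pi'_B$ and $\Pi_B^\perp$ project on
eigenvectors of $Z_B$, they commute with $Z_B$. Then,

$$\|\Pi_B Z_B \Pi_B\|_\infty = \mathrm{tr}[\Pi'_B Z_B] = \min_{\mu_B} \frac{\mathrm{tr}[\mu_B Z_B]}{\mathrm{tr}[\mu_B]}, \tag{15}$$

where the minimization is over all positive operators in the
support of $\Pi_B^\perp + \Pi'_B$. Fixing $\mu_B = (\Pi_B^\perp + \Pi'_B)\rho'_B(\Pi_B^\perp + \Pi'_B)$,
we obtain the following upper bound for (15):

$$\|\Pi_B Z_B \Pi_B\|_\infty
\le \frac{\mathrm{tr}[(\Pi_B^\perp + \Pi'_B)\rho'_B(\Pi_B^\perp + \Pi'_B) Z_B]}{\mathrm{tr}[(\Pi_B^\perp + \Pi'_B)\rho'_B]}
\le \frac{\mathrm{tr}[(\Pi_B^\perp + \Pi'_B)\sqrt{Z_B}\,\rho'_B \sqrt{Z_B}]}{\mathrm{tr}[(\Pi_B^\perp + \Pi'_B)\rho'_B]}
\le \frac{\mathrm{tr}[\rho'_B Z_B]}{\mathrm{tr}[(\Pi_B^\perp + \Pi'_B)\rho'_B]}
\le \frac{2^{H_{\max}(A|B)_{\rho|\rho'}}}{1 - \sqrt{1 - \varepsilon^2}},$$

where we used Equation (10) and the fact that $\mathrm{tr}[(\Pi_B^\perp +
\Pi'_B)\rho'_B] \ge 1 - \sqrt{1 - \varepsilon^2}$ by definition of $\Pi_B$. Then, taking
the logarithm on both sides yields (14).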

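Finally, the proof is concluded by the upper bound

$$P(\tilde\rho_{AB}, \rho'_{AB}) = P(\Pi_B \rho_{AB} \Pi_B, \rho'_{AB})$$
$$\le P(\Pi_B \rho_{AB} \Pi_B, \Pi_B \rho'_{AB} \Pi_B) + P(\Pi_B \rho'_{AB} \Pi_B, \rho'_{AB})$$
$$\le P(\rho_{AB}, \rho'_{AB}) + \sqrt{2\,\mathrm{tr}[\Pi_B^\perp \rho'_{AB}] - (\mathrm{tr}[\Pi_B^\perp \rho'_{AB}])^2}$$
$$\le \varepsilon' + \varepsilon,$$

where we use Inequalities (33) and (34) and the fact that the function
$t \mapsto \sqrt{2t - t^2}$ is monotonically increasing in the interval $[0, 1]$. ■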

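C. The ε-Smooth S-Entropy

To prove the chain rules, we also need an auxiliary entropy
measure called ε-smooth S-entropy (see footnote 2), whose definition and
basic properties are given in the following.

We assume that $\rho_{AB} \in \mathcal{S}_{\le}(\mathcal{H}_{AB})$ and $\sigma_B \in \mathcal{S}_{\le}(\mathcal{H}_B)$
with $\mathrm{supp}(\rho_B) \subseteq \mathrm{supp}(\sigma_B)$ and denote for every $\lambda \in \mathbb{R}$ the
projector onto the eigenspace corresponding to the negative
eigenvalues of the operator $2^\lambda \rho_{AB} - \sigma_B$ by $P^\lambda_{AB}$.

Definition 11. Let $\varepsilon \ge 0$. Then the ε-smooth S-entropy of $A$
conditioned on $B$ of $\rho_{AB}$ relative to $\sigma_B$ is defined as

$$S^{\varepsilon}(A|B)_{\rho|\sigma} := \inf\{\lambda \in \mathbb{R} : \mathrm{tr}[P^\lambda_{AB}\, \rho_{AB}] \le \varepsilon\}. \tag{16}$$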

We would like to upper bound this new quantity in terms 
of the max-entropy. In order to achieve this we prove the 
following technical lemma: 

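Lemma 12. Let $\varepsilon > 0$ and let $\lambda_{\inf} \in \mathbb{R}$ be the infimum in
Definition 11. Then there exists a number $\lambda \in \mathbb{R}$ such that
$\lambda \ge \lambda_{\inf}$ and $\mathrm{tr}[P^\lambda_{AB}\, \rho_{AB}] \ge \varepsilon$.

Proof: From the assumption $\mathrm{supp}(\rho_{AB}) \subseteq
\mathrm{supp}(\mathbb{1}_A \otimes \sigma_B)$ it follows that one can always find a
sufficiently small real number $\lambda$ so that the operator
$2^\lambda \rho_{AB} - \sigma_B$ becomes negative definite on $\mathcal{H}_A \otimes \mathrm{supp}(\sigma_B)$.
For any such $\lambda$ we trivially have $\mathrm{tr}[P^\lambda_{AB}\, \rho_{AB}] \ge \varepsilon$. Define
$\lambda_{\sup} := \sup\{\lambda \in \mathbb{R} : \mathrm{tr}[P^\lambda_{AB}\, \rho_{AB}] \ge \varepsilon\}$ and assume that
$\lambda_{\sup} < \lambda_{\inf}$. Then for every $\lambda \in (\lambda_{\sup}, \lambda_{\inf})$ we would
have $\mathrm{tr}[P^\lambda_{AB}\, \rho_{AB}] < \varepsilon$, which however contradicts the
assumption that $\lambda_{\inf}$ is an infimum. Therefore, we conclude
that $\lambda_{\inf} \le \lambda_{\sup}$. Thus one can always find a sufficiently
small $\delta \ge 0$ such that the number $\lambda := \lambda_{\sup} - \delta \ge \lambda_{\inf}$ and
$\mathrm{tr}[P^\lambda_{AB}\, \rho_{AB}] \ge \varepsilon$ holds, which concludes the proof. ■

2 The idea for this new entropy measure was originally proposed by Robert
König.

The next lemma gives the upper bound of the ε-smooth
S-entropy in terms of the max-entropy.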

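Lemma 13. Let $\rho_{AB} \in \mathcal{S}_{\le}(\mathcal{H}_{AB})$, $\sigma_B \in \mathcal{S}_{\le}(\mathcal{H}_B)$ and $\varepsilon > 0$.
Then,

$$S^{\varepsilon}(A|B)_{\rho|\sigma} \le H_{\max}(A|B)_{\rho|\sigma} + \log \frac{1}{\varepsilon^2}. \tag{17}$$

Proof: Let $\lambda_{\inf} \in \mathbb{R}$ be the infimum in Definition 11,
that is, $\lambda_{\inf} = S^{\varepsilon}(A|B)_{\rho|\sigma}$, let $\lambda$ be as in Lemma 12 and let $P^{\pm}_{AB}$
denote the projectors onto the positive/negative eigenvalues of
$\rho_{AB} - 2^{-\lambda}\sigma_B$, respectively. Then, a straightforward computation
yields

$$2^{\frac{1}{2} H_{\max}(A|B)_{\rho|\sigma} - \frac{1}{2} S^{\varepsilon}(A|B)_{\rho|\sigma}}
\ge \mathrm{tr}\bigl[\sqrt{\rho_{AB}}\,\sqrt{2^{-\lambda}\sigma_B}\bigr]
\ge \mathrm{tr}[P^{+}_{AB}\, 2^{-\lambda}\sigma_B + P^{-}_{AB}\, \rho_{AB}]
\ge \mathrm{tr}[P^{-}_{AB}\, \rho_{AB}]
\ge \varepsilon, \tag{18}$$

where we applied Corollary 18 and used the fact that $P^{-}_{AB}$
is identical with the projector $P^\lambda_{AB}$. Taking the logarithm on
both sides of (18) and rearranging the terms we obtain (17). ■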



IV. Main Results 

This section contains the main result of this paper: a
derivation of the previously unknown chain rules for smooth
min- and max-entropies. To simplify presentation hereafter,
we introduce the function

$$f : \varepsilon \mapsto \log \frac{1}{1 - \sqrt{1 - \varepsilon^2}},$$

that appears as an error term in the chain rules. It vanishes as
$\varepsilon \to 1$ and grows logarithmically in $1/\varepsilon$ when $\varepsilon \to 0$.
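The behaviour of the error term $f$ can be made concrete with a short numerical sketch. The rewriting $1/(1 - \sqrt{1-\varepsilon^2}) = (1 + \sqrt{1-\varepsilon^2})/\varepsilon^2$ used below is an elementary identity chosen here to avoid floating-point cancellation for small $\varepsilon$; it is an implementation choice, not something taken from the paper.

```python
import math

def f(eps):
    """f(eps) = log2(1 / (1 - sqrt(1 - eps^2))) for 0 < eps <= 1.

    Evaluated as log2((1 + sqrt(1 - eps^2)) / eps^2), which is algebraically
    identical but numerically stable when eps is small.
    """
    root = math.sqrt(1.0 - eps * eps)
    return math.log2((1.0 + root) / (eps * eps))

for eps in (1.0, 0.5, 0.1, 1e-2, 1e-4):
    # For small eps, f(eps) approaches log2(2 / eps^2) = 1 + 2*log2(1/eps).
    print(eps, f(eps), math.log2(2.0 / (eps * eps)))
```

The table shows that $f(1) = 0$ and that $f(\varepsilon)$ approaches $1 + 2\log(1/\varepsilon)$ as $\varepsilon \to 0$, which is the logarithmic growth mentioned above.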

As remarked in the introduction, the explicit form of two
of the chain rules has already been derived in [11], namely

$$H_{\min}^{\varepsilon+\varepsilon'+2\varepsilon''}(AB|C)_\rho \ge H_{\min}^{\varepsilon''}(A|BC)_\rho + H_{\min}^{\varepsilon'}(B|C)_\rho - f(\varepsilon)$$

and its dual. Here we provide proofs for the remaining three
pairs of chain rules. Due to the smooth duality relation (8) it
is enough to prove only one of each pair.

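Theorem 14. Let $\varepsilon > 0$, $\varepsilon', \varepsilon'' \ge 0$ and $\rho_{ABC} \in \mathcal{S}_{\le}(\mathcal{H}_{ABC})$.
Then,

$$H_{\min}^{\varepsilon'}(AB|C)_\rho \le H_{\min}^{\varepsilon+\varepsilon'+2\varepsilon''}(A|BC)_\rho + H_{\max}^{\varepsilon''}(B|C)_\rho + 2f(\varepsilon). \tag{19}$$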

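Proof: Let $\rho'_{ABC} \approx_{\varepsilon'} \rho_{ABC}$, $\rho''_{BC} \approx_{\varepsilon''} \rho_{BC}$ and $\sigma_C \in
\mathcal{S}_{\le}(\mathcal{H}_C)$ be such that

$$H_{\min}(AB|C)_{\rho'} = H_{\min}^{\varepsilon'}(AB|C)_\rho,$$
$$H_{\max}(B|C)_{\rho''} = H_{\max}^{\varepsilon''}(B|C)_\rho,$$
and

$$\rho'_{ABC} \le 2^{-H_{\min}(AB|C)_{\rho'}} \sigma_C = 2^{-H_{\min}^{\varepsilon'}(AB|C)_\rho} \sigma_C. \tag{20}$$

For every $\delta > 0$ there is a $\delta' \in (0, \delta]$ such that the projector
$P_{BC}$ onto the negative eigenvalues of the operator $2^\lambda \rho''_{BC} - \sigma_C$
with $\lambda := S^{\epsilon}(B|C)_{\rho''|\sigma} + \delta'$, for an auxiliary parameter $\epsilon > 0$
(distinct from $\varepsilon$), satisfies the constraint
$\mathrm{tr}[P_{BC}\, \rho''_{BC}] \le \epsilon$ in Definition 11. If $P^\perp_{BC}$ is the orthogonal
complement of $P_{BC}$, we have

$$P^\perp_{BC}\, \sigma_C\, P^\perp_{BC} \le 2^\lambda P^\perp_{BC}\, \rho''_{BC}\, P^\perp_{BC}. \tag{21}$$
A conjugation of (20) with $P^\perp_{BC}$ together with (21) yields

$$P^\perp_{BC}\, \rho'_{ABC}\, P^\perp_{BC} \le 2^{-H_{\min}^{\varepsilon'}(AB|C)_\rho + \lambda}\, P^\perp_{BC}\, \rho''_{BC}\, P^\perp_{BC},$$
which is equivalent to

$$H_{\min}(A|BC)_{P^\perp \rho' P^\perp \,|\, P^\perp \rho'' P^\perp} \ge H_{\min}^{\varepsilon'}(AB|C)_\rho - \lambda.$$

A subsequent optimization of the left-hand side over all
$\mathcal{S}_{\le}(\mathcal{H}_{BC})$ yields

$$H_{\min}(A|BC)_{P^\perp \rho' P^\perp} \ge H_{\min}^{\varepsilon'}(AB|C)_\rho - \lambda. \tag{22}$$

Since $\rho_{ABC}$ is an extension of $\rho_{BC}$, by Corollary 22 there exists
an extension $\rho''_{ABC}$ of $\rho''_{BC}$ such that $P(\rho''_{ABC}, \rho_{ABC}) =
P(\rho''_{BC}, \rho_{BC})$. Then Inequality (33) and Inequality (34) give
us the following upper bound for the purified distance between
$P^\perp_{BC}\, \rho'_{ABC}\, P^\perp_{BC}$ and $\rho_{ABC}$:

$$P(P^\perp_{BC}\, \rho'_{ABC}\, P^\perp_{BC}, \rho_{ABC})
\le P(P^\perp_{BC}\, \rho'_{ABC}\, P^\perp_{BC}, P^\perp_{BC}\, \rho_{ABC}\, P^\perp_{BC})
+ P(P^\perp_{BC}\, \rho_{ABC}\, P^\perp_{BC}, P^\perp_{BC}\, \rho''_{ABC}\, P^\perp_{BC})
+ P(P^\perp_{BC}\, \rho''_{ABC}\, P^\perp_{BC}, \rho''_{ABC})
+ P(\rho''_{ABC}, \rho_{ABC})$$
$$\le \sqrt{2\epsilon - \epsilon^2} + \varepsilon' + 2\varepsilon''.$$

After smoothing the left-hand side of (22) and upper-bounding
the term $S^{\epsilon}(B|C)_{\rho''|\sigma}$ on the right-hand side of (22) by
$H_{\max}(B|C)_{\rho''|\sigma} + \log(1/\epsilon^2)$ in accordance with Lemma 13 and
subsequently optimizing it over $\mathcal{S}_{\le}(\mathcal{H}_C)$, we obtain

$$H_{\min}^{\varepsilon'}(AB|C)_\rho \le H_{\min}^{\sqrt{2\epsilon - \epsilon^2} + \varepsilon' + 2\varepsilon''}(A|BC)_\rho + H_{\max}^{\varepsilon''}(B|C)_\rho + \log \frac{1}{\epsilon^2} + \delta'.$$

Finally, the substitution $\epsilon := 1 - \sqrt{1 - \varepsilon^2}$ leads to the chain
rule (19) in the limit $\delta \to 0$. ■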

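Theorem 15. Let $\varepsilon > 0$, $\varepsilon', \varepsilon'' \ge 0$ and $\rho_{ABC} \in \mathcal{S}_{\le}(\mathcal{H}_{ABC})$.
Then,

$$H_{\min}^{\varepsilon'}(AB|C)_\rho \le H_{\max}^{\varepsilon''}(A|BC)_\rho + H_{\min}^{2\varepsilon+\varepsilon'+2\varepsilon''}(B|C)_\rho + 3f(\varepsilon). \tag{23}$$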

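Proof: Let $\rho_{ABCD}$ be a purification of $\rho_{ABC}$. If

$$H_{\max}^{\varepsilon'}(AB|D)_\rho \ge H_{\max}^{2\varepsilon+\varepsilon'+2\varepsilon''}(B|AD)_\rho + H_{\min}^{\varepsilon''}(A|D)_\rho - 3f(\varepsilon)$$

holds, then the chain rule follows by the duality relation (8).
Let $\rho'_{ABD} \approx_{\varepsilon'} \rho_{ABD}$, $\rho''_{AD} \approx_{\varepsilon''} \rho_{AD}$ and $\sigma_D \in \mathcal{S}_{\le}(\mathcal{H}_D)$ be such that

$$H_{\max}(AB|D)_{\rho'} = H_{\max}^{\varepsilon'}(AB|D)_\rho,$$
$$H_{\min}(A|D)_{\rho''} = H_{\min}^{\varepsilon''}(A|D)_\rho,$$

and

$$\rho''_{AD} \le 2^{-H_{\min}(A|D)_{\rho''}} \sigma_D = 2^{-H_{\min}^{\varepsilon''}(A|D)_\rho} \sigma_D. \tag{24}$$

Again we use the fact that for every $\delta > 0$ there exists a
$\delta' \in (0, \delta]$ such that for $\lambda := S^{\epsilon}(AB|D)_{\rho'|\sigma} + \delta'$, with an auxiliary
parameter $\epsilon > 0$ (distinct from $\varepsilon$), the
projector $P_{ABD}$ onto the negative eigenvalues of the operator
$2^\lambda \rho'_{ABD} - \sigma_D$ satisfies the constraint $\mathrm{tr}[P_{ABD}\, \rho'_{ABD}] \le \epsilon$ in
Definition 11. If $P^\perp_{ABD}$ denotes the orthogonal complement of
$P_{ABD}$, then

$$2^\lambda P^\perp_{ABD}\, \rho'_{ABD}\, P^\perp_{ABD} \ge P^\perp_{ABD}\, \sigma_D\, P^\perp_{ABD}. \tag{25}$$

A conjugation of (24) with $P^\perp_{ABD}$ and a subsequent combination
with (25) yields

$$2^{\lambda - H_{\min}^{\varepsilon''}(A|D)_\rho}\, P^\perp_{ABD}\, \rho'_{ABD}\, P^\perp_{ABD} \ge P^\perp_{ABD}\, \rho''_{AD}\, P^\perp_{ABD}. \tag{26}$$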

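Consider now the max-entropy

$$2^{H_{\max}(B|AD)_{P^\perp \rho' P^\perp | \rho''}} = \min_{\substack{Z_{ABD} \ge 0 \\ P^\perp_{ABD}\, \rho'_{ABCD}\, P^\perp_{ABD} \le Z_{ABD} \otimes \mathbb{1}_C}} \mathrm{tr}[(\mathbb{1}_B \otimes \rho''_{AD}) Z_{ABD}], \tag{27}$$

where $\rho'_{ABCD}$ is a purification of $\rho'_{ABD}$. Making use of (26)
and the inequality

$$P^\perp_{ABD}\, \rho'_{ABCD}\, P^\perp_{ABD} \le P^\perp_{ABD} \otimes \mathbb{1}_C$$

and omitting the identity operator, we can upper-bound the
right-hand side of (27) in the following way:

$$2^{H_{\max}(B|AD)_{P^\perp \rho' P^\perp | \rho''}} \le 2^{\lambda - H_{\min}^{\varepsilon''}(A|D)_\rho}\, \mathrm{tr}[P^\perp_{ABD}\, \rho'_{ABD}\, P^\perp_{ABD}] \le 2^{\lambda - H_{\min}^{\varepsilon''}(A|D)_\rho},$$

where we use that the term $\mathrm{tr}[P^\perp_{ABD}\, \rho'_{ABD}\, P^\perp_{ABD}]$ is upper
bounded by one. Taking the logarithm and substituting $\lambda$ yields

$$H_{\max}(B|AD)_{P^\perp \rho' P^\perp | \rho''} \le S^{\epsilon}(AB|D)_{\rho'|\sigma} + \delta' - H_{\min}^{\varepsilon''}(A|D)_\rho.$$

A subsequent application of Lemma 13 implies

$$H_{\max}(B|AD)_{P^\perp \rho' P^\perp | \rho''} \le H_{\max}(AB|D)_{\rho'} - H_{\min}^{\varepsilon''}(A|D)_\rho + \delta' + \log \frac{1}{\epsilon^2}, \tag{28}$$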

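where the max-entropy term on the right-hand side has been
optimized on $\mathcal{S}_{\le}(\mathcal{H}_D)$. Consider now the left-hand side of
(28). Corollary 22 guarantees the existence of an extension
$\rho''_{ABD}$ of $\rho''_{AD}$ such that $P(\rho''_{AD}, \rho_{AD}) = P(\rho''_{ABD}, \rho_{ABD})$. Then, it
follows that

$$P(P^\perp_{ABD}\, \rho'_{ABD}\, P^\perp_{ABD}, \rho''_{ABD}) \le P(P^\perp_{ABD}\, \rho'_{ABD}\, P^\perp_{ABD}, \rho'_{ABD}) + P(\rho'_{ABD}, \rho''_{ABD})
\le \sqrt{2\epsilon - \epsilon^2} + \varepsilon' + \varepsilon''.$$

Thus, according to Lemma 10, there exists a state
$\hat\rho_{ABD} \approx_{\varepsilon + \sqrt{2\epsilon - \epsilon^2} + \varepsilon' + 2\varepsilon''} \rho_{ABD}$ such that

$$H_{\max}(B|AD)_{\hat\rho} \le H_{\max}^{\varepsilon'}(AB|D)_\rho - H_{\min}^{\varepsilon''}(A|D)_\rho + \delta' + \log \frac{1}{\epsilon^2} + f(\varepsilon).$$
Smoothing of the left-hand side and regrouping the terms in
the last inequality yields

$$H_{\max}^{\varepsilon'}(AB|D)_\rho \ge H_{\max}^{\varepsilon + \sqrt{2\epsilon - \epsilon^2} + \varepsilon' + 2\varepsilon''}(B|AD)_\rho + H_{\min}^{\varepsilon''}(A|D)_\rho - \delta' - \log \frac{1}{\epsilon^2} - f(\varepsilon).$$

Finally, setting $\epsilon := 1 - \sqrt{1 - \varepsilon^2}$, taking the limit $\delta \to 0$,
and applying the duality relation for smooth entropies (8), we
obtain chain rule (23). ■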

The last chain rule follows from chain rule (19) together 
with Lemma 6. 

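Corollary 16. Let $\varepsilon', \varepsilon'', \varepsilon''' \ge 0$ and $\rho_{ABC} \in \mathcal{S}_{\le}(\mathcal{H}_{ABC})$
such that $\varepsilon' + 2\varepsilon'' + \varepsilon''' < 1 - 2\sqrt{1 - \mathrm{tr}\,\rho}$. Then,

$$H_{\min}^{\varepsilon'}(AB|C)_\rho \le H_{\max}^{\varepsilon'''}(A|BC)_\rho + H_{\max}^{\varepsilon''}(B|C)_\rho + g(\varepsilon', \varepsilon'', \varepsilon''', \mathrm{tr}\,\rho), \tag{29}$$

where

$$g(\varepsilon', \varepsilon'', \varepsilon''', \mathrm{tr}\,\rho) := \inf_{\varepsilon} \left\{ 2f(\varepsilon) + \log \frac{1}{1 - \bigl(\varepsilon + \varepsilon' + 2\varepsilon'' + \varepsilon''' + 2\sqrt{1 - \mathrm{tr}\,\rho}\bigr)^2} \right\}$$

and the infimum is taken in the range $0 < \varepsilon < 1 - \varepsilon' - 2\varepsilon'' -
\varepsilon''' - 2\sqrt{1 - \mathrm{tr}\,\rho}$.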
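The correction term $g$ is defined through a one-dimensional infimum, so it is easy to evaluate numerically. The sketch below uses a plain grid search; the grid resolution, the function names and the sample parameters are illustrative assumptions. For a normalized state with all smoothing parameters set to zero the infimum can also be found by hand: writing $u = \sqrt{1-\varepsilon^2}$, the expression becomes $-2\log\bigl(u(1-u)\bigr)$, which is minimized at $u = 1/2$ and equals 4.

```python
import math

def f(eps):
    """Error term f(eps) = log2(1 / (1 - sqrt(1 - eps^2))), in a stable form."""
    root = math.sqrt(1.0 - eps * eps)
    return math.log2((1.0 + root) / (eps * eps))

def g(e1, e2, e3, tr_rho=1.0, steps=20000):
    """Grid-search approximation of g(eps', eps'', eps''', tr rho).

    The grid can only overestimate the true infimum.
    """
    a = e1 + 2.0 * e2 + e3 + 2.0 * math.sqrt(1.0 - tr_rho)
    if a >= 1.0:
        raise ValueError("requires eps' + 2 eps'' + eps''' + 2 sqrt(1 - tr rho) < 1")
    best = math.inf
    for k in range(1, steps):
        eps = (1.0 - a) * k / steps            # 0 < eps < 1 - a
        val = 2.0 * f(eps) - math.log2(1.0 - (eps + a) ** 2)
        best = min(best, val)
    return best

print(g(0.0, 0.0, 0.0))        # non-smooth, normalized case
print(g(0.05, 0.05, 0.04))     # eps' + 2 eps'' + eps''' = 0.19 < 1/5
```

The first value should be close to 4 and the second should stay below 6, in line with the bound on $g$ quoted in the introduction.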

Proof: Let $\varepsilon > 0$ be any smoothing parameter such that
$\varepsilon < 1 - \varepsilon' - 2\varepsilon'' - \varepsilon''' - 2\sqrt{1 - \mathrm{tr}\,\rho}$. Then, by Lemma 6,
the smooth min-entropy term on the right-hand side of (19) is
upper bounded by

$$H_{\max}^{\varepsilon'''}(A|BC)_\rho + \log \frac{1}{1 - \bigl(\varepsilon + \varepsilon' + 2\varepsilon'' + \varepsilon''' + 2\sqrt{1 - \mathrm{tr}\,\rho}\bigr)^2},$$

which immediately gives (29). ■

In contrast to the previous chain rules, the last one leads to
non-trivial results even if we apply it to non-smooth entropies.
In particular, for a normalized state $\rho_{ABC}$, we find

$$H_{\min}(AB|C)_\rho \le H_{\max}(A|BC)_\rho + H_{\max}(B|C)_\rho + 4.$$

Appendix A
Proof of Lemma 6

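Restatement of Lemma 6. Let $\rho_{AB} \in \mathcal{S}_{\le}(\mathcal{H}_{AB})$ and $\varepsilon, \varepsilon' \ge 0$
such that $\varepsilon + \varepsilon' + 2\sqrt{1 - \mathrm{tr}\,\rho_{AB}} < 1$. Then,

$$H_{\min}^{\varepsilon'}(A|B)_\rho \le H_{\max}^{\varepsilon}(A|B)_\rho + \log \frac{1}{1 - \bigl(\varepsilon + \varepsilon' + 2\sqrt{1 - \mathrm{tr}\,\rho}\bigr)^2}.$$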

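Proof: Define $\hat\rho_{AB} := \rho_{AB}/\mathrm{tr}(\rho_{AB})$. According to
Lemma 5.2 in [16] there are embeddings $U : \mathcal{H}_A \to \mathcal{H}_{A'}$
and $V : \mathcal{H}_B \to \mathcal{H}_{B'}$ such that there exists a normalized state
$\tilde\rho_{A'B'} \approx_\varepsilon \hat\rho_{A'B'}$, where $\hat\rho_{A'B'} = (U \otimes V)\hat\rho_{AB}(U^\dagger \otimes V^\dagger)$,
which minimizes the smooth max-entropy $H_{\max}^{\varepsilon}(A'|B')_{\hat\rho} =
H_{\max}^{\varepsilon}(A|B)_{\hat\rho}$.

Consider now the quantity $2^{-H_{\min}^{\varepsilon+\varepsilon'}(A'|B')_{\tilde\rho}}$. We are simultaneously
minimizing over all $\sigma_{B'} \in \mathcal{S}_{\le}(\mathcal{H}_{B'})$ and all states
$\bar\rho_{A'B'}$ that are $(\varepsilon + \varepsilon')$-close to the normalized state $\tilde\rho_{A'B'}$.
By Uhlmann's theorem the latter constraint translates into
$\mathrm{tr}[\bar\rho_{A'B'C}\, \tilde\rho_{A'B'C}] \ge 1 - (\varepsilon + \varepsilon')^2$, where $\mathcal{H}_C$ is a purifying
system. We can formulate $2^{-H_{\min}^{\varepsilon+\varepsilon'}(A'|B')_{\tilde\rho}}$ as the following
semidefinite program:

Primal problem:
minimum: $\mathrm{tr}[\mathbb{1}_{B'} \sigma_{B'}]$
subject to: $\mathbb{1}_{A'} \otimes \sigma_{B'} \ge \mathrm{tr}_C[\bar\rho_{A'B'C}]$
$\mathrm{tr}[\bar\rho_{A'B'C}\, \tilde\rho_{A'B'C}] \ge 1 - (\varepsilon + \varepsilon')^2$
$\mathrm{tr}[\bar\rho_{A'B'C}] \le 1$
$\sigma_{B'} \ge 0$, $\bar\rho_{A'B'C} \ge 0$

Dual problem:
maximum: $(1 - (\varepsilon + \varepsilon')^2)\lambda - \mu$
subject to: $\mathrm{tr}_{A'}[E_{A'B'}] \le \mathbb{1}_{B'}$
$\lambda \tilde\rho_{A'B'C} \le E_{A'B'} \otimes \mathbb{1}_C + \mu \mathbb{1}_{A'B'C}$
$E_{A'B'} \ge 0$, $\lambda, \mu \ge 0$,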



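where $\sigma_{B'}$ and $\bar\rho_{A'B'C}$ are the primal variables and
$E_{A'B'}$, $\lambda$ and $\mu$ are the dual variables, respectively. Let
$Z_{A'B'}$ be a primal optimal plan for the semidefinite program
of $H_{\max}(A'|B')_{\tilde\rho}$, that is, $Z_{A'B'} \otimes \mathbb{1}_C \ge \tilde\rho_{A'B'C}$ and
$\mathrm{tr}_{A'}[Z_{A'B'}] \le 2^{H_{\max}(A'|B')_{\tilde\rho}} \mathbb{1}_{B'}$. Then the variables $E_{A'B'} =
2^{-H_{\max}(A'|B')_{\tilde\rho}} Z_{A'B'}$, $\lambda = 2^{-H_{\max}(A'|B')_{\tilde\rho}}$ and $\mu = 0$ are a
dual feasible plan for the above semidefinite program. By the
weak duality theorem we have then

$$\bigl(1 - (\varepsilon + \varepsilon')^2\bigr)\, 2^{-H_{\max}(A'|B')_{\tilde\rho}} \le 2^{-H_{\min}^{\varepsilon+\varepsilon'}(A'|B')_{\tilde\rho}}.$$

Taking the logarithm and considering the fact that all states
which are $\varepsilon'$-close to $\hat\rho_{A'B'}$ are contained in the $(\varepsilon + \varepsilon')$-neighborhood
of $\tilde\rho_{A'B'}$, we get

$$H_{\min}^{\varepsilon'}(A'|B')_{\hat\rho} \le H_{\min}^{\varepsilon+\varepsilon'}(A'|B')_{\tilde\rho} \le H_{\max}^{\varepsilon}(A'|B')_{\hat\rho} + \log \frac{1}{1 - (\varepsilon + \varepsilon')^2}. \tag{30}$$

By Proposition 5.3 in [16] we have $H_{\min}^{\varepsilon'}(A'|B')_{\hat\rho} =
H_{\min}^{\varepsilon'}(A|B)_{\hat\rho}$ and $H_{\max}^{\varepsilon}(A'|B')_{\hat\rho} = H_{\max}^{\varepsilon}(A|B)_{\hat\rho}$. Finally,
substituting in (30) $\varepsilon \to \varepsilon + \sqrt{1 - \mathrm{tr}(\rho_{AB})}$ and $\varepsilon' \to
\varepsilon' + \sqrt{1 - \mathrm{tr}(\rho_{AB})}$ and considering that
$H_{\min}^{\varepsilon'}(A|B)_\rho \le H_{\min}^{\varepsilon' + \sqrt{1 - \mathrm{tr}\,\rho}}(A|B)_{\hat\rho}$ as well as
$H_{\max}^{\varepsilon + \sqrt{1 - \mathrm{tr}\,\rho}}(A|B)_{\hat\rho} \le H_{\max}^{\varepsilon}(A|B)_\rho$, we conclude the proof.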

Appendix B 
Technical Lemmas 

A. Operator inequalities 

Theorem 17 ([17], Theorem 1). Let $Q$ and $R$ be positive
operators on a Hilbert space $\mathcal{H}$ and let $0 \le s \le 1$. Then,

$$\mathrm{tr}[Q^s R^{1-s}] \ge \tfrac{1}{2}\, \mathrm{tr}[Q + R - |Q - R|]. \tag{31}$$
From this theorem we can draw the following useful corollary.

Corollary 18. Let $R$ and $Q$ be positive operators on a
Hilbert space $\mathcal{H}$, let $0 \le s \le 1$ and let $P_{\pm}$ denote the
orthogonal projectors onto the eigenspaces corresponding to
positive/negative eigenvalues of the operator $Q - R$, respectively.
Then,

$$\mathrm{tr}[Q^s R^{1-s}] \ge \mathrm{tr}[P_+ R + P_- Q].$$
Proof: We make the following decomposition of $|Q - R|$:

$$|Q - R| = P_+ (Q - R) P_+ - P_- (Q - R) P_-, \tag{32}$$

where $P_{\pm}$ are the projectors onto the positive and negative
eigenvalues of $Q - R$, respectively. Substituting (32) in (31)
and using the fact that $P_+ + P_- = \mathbb{1}$, we obtain

$$\mathrm{tr}[Q^s R^{1-s}] \ge \tfrac{1}{2}\, \mathrm{tr}[Q + R - |Q - R|]
= \mathrm{tr}[P_- Q + (\mathbb{1} - P_-) R]
= \mathrm{tr}[P_- Q + P_+ R].$$

■
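Corollary 18 lends itself to a quick numerical sanity check on random positive operators. The following sketch assumes NumPy; the dimension, seeds and helper names are illustrative. Zero eigenvalues of $Q - R$ are assigned to $P_-$ so that $P_+ + P_- = \mathbb{1}$, as used in the proof.

```python
import numpy as np

def random_psd(d, rng):
    """Random positive semidefinite d x d matrix (hypothetical test data)."""
    g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    return g @ g.conj().T

def mpow(m, s):
    """Fractional power of a positive semidefinite matrix via its eigendecomposition."""
    w, v = np.linalg.eigh(m)
    w = np.clip(w, 0.0, None)
    return (v * w**s) @ v.conj().T

def corollary18_sides(q, r, s):
    """Return (tr[Q^s R^(1-s)], tr[P+ R + P- Q], (1/2) tr[Q + R - |Q - R|])."""
    w, v = np.linalg.eigh(q - r)
    pos = v[:, w > 0]
    p_plus = pos @ pos.conj().T
    p_minus = np.eye(len(w)) - p_plus          # so that P+ + P- = identity
    lhs = np.trace(mpow(q, s) @ mpow(r, 1.0 - s)).real
    rhs = np.trace(p_plus @ r + p_minus @ q).real
    mid = 0.5 * (np.trace(q + r).real - np.sum(np.abs(w)))
    return lhs, rhs, mid

rng = np.random.default_rng(7)
results = [corollary18_sides(random_psd(4, rng), random_psd(4, rng), s)
           for s in (0.25, 0.5, 0.75) for _ in range(5)]
for lhs, rhs, mid in results:
    print(round(lhs, 6), round(rhs, 6), round(mid, 6))
```

In every row the first number should be at least as large as the second, and the second and third should coincide, reflecting the equality used in the last two lines of the proof.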

B. Purified Distance: Properties 

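Lemma 19 ([7], Lemma 7). If $\rho, \sigma \in \mathcal{S}_{\le}(\mathcal{H})$ and $\mathcal{E}$ is a trace
non-increasing CPM on $\mathcal{L}(\mathcal{H})$, then

$$P(\mathcal{E}(\rho), \mathcal{E}(\sigma)) \le P(\rho, \sigma).$$

Evidently, for any $0 \le \Pi \le \mathbb{1}$ the map defined by $\rho \mapsto
\Pi \rho \Pi$, $\rho \in \mathcal{S}_{\le}(\mathcal{H})$, is a trace non-increasing CPM. Thus, in
particular, by the above lemma we have

$$P(\Pi \rho \Pi, \Pi \sigma \Pi) \le P(\rho, \sigma) \tag{33}$$

for $\rho, \sigma \in \mathcal{S}_{\le}(\mathcal{H})$.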

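Lemma 20 ([18], Lemma 7). Let $\rho \in \mathcal{S}_{\le}(\mathcal{H})$ and $0 \le \Pi \le \mathbb{1}$.
Then,

$$P(\Pi \rho \Pi, \rho) \le \frac{1}{\sqrt{\mathrm{tr}\,\rho}} \sqrt{(\mathrm{tr}\,\rho)^2 - (\mathrm{tr}[\Pi^2 \rho])^2}.$$

When $\Pi$ is a projector, that is $\Pi^2 = \Pi$, then a straightforward
computation yields

$$P(\Pi \rho \Pi, \rho) \le \sqrt{2\,\mathrm{tr}[\Pi^\perp \rho] - (\mathrm{tr}[\Pi^\perp \rho])^2}, \tag{34}$$
where $\Pi^\perp = \mathbb{1} - \Pi$ is the orthogonal complement of $\Pi$.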
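Inequality (34) can be checked numerically from the definitions of the fidelity and the purified distance given in Section II. The sketch below assumes NumPy; the random state, the choice of projector and the helper names are illustrative assumptions. It tests one normalized and one sub-normalized state.

```python
import numpy as np

def msqrt(m):
    """Square root of a positive semidefinite matrix via its eigendecomposition."""
    w, v = np.linalg.eigh(m)
    return (v * np.sqrt(np.clip(w, 0.0, None))) @ v.conj().T

def purified_distance(rho, sigma):
    """P(rho, sigma) = sqrt(1 - Fbar^2), with Fbar the generalized fidelity."""
    fid = np.sum(np.linalg.svd(msqrt(rho) @ msqrt(sigma), compute_uv=False))
    slack = (1.0 - np.trace(rho).real) * (1.0 - np.trace(sigma).real)
    gen = fid + np.sqrt(max(0.0, slack))
    return float(np.sqrt(max(0.0, 1.0 - gen**2)))

rng = np.random.default_rng(1)
d, k = 5, 3
g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
rho_norm = g @ g.conj().T
rho_norm /= np.trace(rho_norm).real              # normalized random state

proj = np.diag([1.0] * k + [0.0] * (d - k))      # projector onto the first k basis vectors
checks = []
for scale in (1.0, 0.8):                         # normalized and sub-normalized case
    rho = scale * rho_norm
    t = np.trace((np.eye(d) - proj) @ rho).real  # tr[Pi^perp rho]
    lhs = purified_distance(proj @ rho @ proj, rho)
    rhs = float(np.sqrt(2.0 * t - t**2))
    checks.append((lhs, rhs))
    print(scale, lhs, rhs)
```

In each printed row the purified distance should not exceed the bound on the right.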

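Lemma 21 ([7], Lemma 8). Let $\rho, \sigma \in \mathcal{S}_{\le}(\mathcal{H})$, $\mathcal{H}' \cong \mathcal{H}$ and let
$\bar\rho \in \mathcal{S}_{\le}(\mathcal{H} \otimes \mathcal{H}')$ be a purification of $\rho$. Then, there exists
a purification $\bar\sigma \in \mathcal{S}_{\le}(\mathcal{H} \otimes \mathcal{H}')$ of $\sigma$ such that $P(\rho, \sigma) =
P(\bar\rho, \bar\sigma)$.

From that lemma one infers the following corollary:

Corollary 22. Let $\rho, \sigma \in \mathcal{S}_{\le}(\mathcal{H})$, $\mathcal{H}' \cong \mathcal{H}$ and let $\bar\rho \in \mathcal{S}_{\le}(\mathcal{H} \otimes
\mathcal{H}')$ be an extension of $\rho$. Then, there exists an extension $\bar\sigma \in
\mathcal{S}_{\le}(\mathcal{H} \otimes \mathcal{H}')$ of $\sigma$ such that $P(\rho, \sigma) = P(\bar\rho, \bar\sigma)$.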